A new tool called DISSECT for analyzing large genomic datasets using a Big Data approach

Abstract

Large-scale genetic and genomic data are increasingly available, and the major bottleneck in their analysis is a lack of sufficiently scalable computational tools. To address this problem in the context of complex-trait analysis, we present DISSECT. DISSECT is new, freely available software that exploits the distributed-memory parallel architectures of compute clusters to perform a wide range of genomic and epidemiologic analyses that currently can be carried out only on reduced sample sizes or under restricted conditions. We demonstrate the usefulness of the tool by addressing the challenge of predicting phenotypes from genotype data in human populations using mixed-linear model analysis. We analyse simulated traits from 470,000 individuals genotyped for 590,004 SNPs in ∼4 h using the combined computational power of 8,400 processor cores. We find that prediction accuracies in excess of 80% of the theoretical maximum could be achieved with large sample sizes.
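
To give a rough sense of why a distributed-memory architecture matters at this scale, the sketch below estimates the storage needed for one dense matrix with a row and a column per individual. Mixed-linear model analyses commonly work with such a matrix, but the abstract does not say how DISSECT stores its data, so the dense layout, the 8-byte entries, the even split across cores, and the function names are all assumptions made for illustration. Only the sample size and core count come from the abstract.

```python
# Back-of-the-envelope memory estimate. The dense N x N double-precision
# matrix and the even split across cores are assumptions, not details
# taken from the abstract.

N_INDIVIDUALS = 470_000   # sample size reported in the abstract
N_CORES = 8_400           # processor cores reported in the abstract
BYTES_PER_DOUBLE = 8      # assumption: 64-bit floating-point entries


def dense_matrix_bytes(n, bytes_per_entry=BYTES_PER_DOUBLE):
    """Bytes needed to store a dense n x n matrix."""
    return n * n * bytes_per_entry


def bytes_per_core(total_bytes, n_cores):
    """Share of the matrix each core holds if it is split evenly."""
    return total_bytes / n_cores


if __name__ == "__main__":
    total = dense_matrix_bytes(N_INDIVIDUALS)
    print(f"whole matrix: {total / 1e12:.2f} TB")
    print(f"per core:     {bytes_per_core(total, N_CORES) / 1e6:.1f} MB")
```

Under these assumptions a single such matrix takes about 1.77 TB, more than a typical single machine holds in memory, while an even split over 8,400 cores leaves roughly 210 MB per core.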